Abstract: Consider a set of networked agents endowed with 
private cost functions and seeking to find a consensus on 
the minimizer of the aggregate cost. A new class of random 
asynchronous distributed optimization methods is introduced. 
The methods generalize the standard Alternating Direction 
Method of Multipliers (ADMM) to an asynchronous setting 
where isolated components of the network are activated in an 
uncoordinated fashion. The algorithms rely on the introduction 
of randomized Gauss-Seidel iterations of a Douglas-Rachford 
operator for finding zeros of a sum of two monotone operators. 
Convergence to the sought minimizers is established under mild 
connectivity conditions. Numerical results support our claims. 

I. Introduction 

Consider a network represented by a set $V$ of agents 
seeking to solve the following optimization problem on a 
Euclidean space $X$: 

$$\inf_{x \in X} \sum_{v \in V} f_v(x) \tag{1}$$

where $f_v$ is a convex real function known by agent $v$ only. 
Function $f_v$ can be interpreted as the price paid by agent 
$v$ when the global network state is equal to $x$. 

This problem arises for instance in cloud learning appli- 
cations where massive data sets are distributed in a network 
and processed by distinct virtual machines [1]. We investigate 
distributed optimization algorithms: agents iteratively 
update a local estimate using their private objective $f_v$ and, 
simultaneously, exchange information with their neighbors 
in order to eventually reach a consensus on the global 
solution. Standard algorithms are generally synchronous: all 
agents are supposed to complete their local computations 
synchronously at each tick of an external clock, and then 
synchronously merge their local results. However, in many 
situations, one faces variable sizes of the local data sets 
along with heterogeneous computational abilities of the 
virtual machines. Synchronism then becomes a burden, as 
the global convergence rate is expected to depend on the 
local computation times of the slowest agents. It is crucial to 
introduce asynchronous methods which allow the estimates 
to be updated in a non-coordinated fashion, rather than all 
together or in some frozen order. 

The literature contains at least three classes of distributed 
optimization methods for solving (1). The first one is based 
on the simultaneous use of a local first-order optimization 
algorithm (subgradient algorithm [2], [3], [4], Nesterov-like 
method [5], [6]) and a gossip process which drives the 
network to a consensus. A second class of methods is formed 
by distributed Newton-Raphson methods [7]. This paper 
focuses on a third class of methods derived from proximal 
splitting methods [8], [9], [10]. Perhaps the most emblematic 
proximal splitting method is the so-called Alternating Direc- 
tion Method of Multipliers (ADMM) recently popularized to 
multiagent systems by the monograph [11]. Schizas et al. 
demonstrated the remarkable potential of ADMM to handle 
distributed optimization problems and introduced a useful 
framework to encompass graph-constrained communications 
[12]. We also refer to [13], [14] for recent contributions. 
However, all of these works share a common perspective: Al- 
gorithms are synchronous. They require a significant amount 
of coordination or scheduling between agents. In [12], [13], 
agents operate in parallel, whereas [14] proposes a sequential 
version of ADMM where agents operate one after the other 
in a predetermined order. 

Contributions. This paper introduces a novel class of distributed 
algorithms to solve (1). The algorithms are asynchronous 
in the sense that some components of the network 
are allowed to wake up at random and perform local updates, 
while the rest of the network stands still. No coordinator 
or global clock is needed. The frequency of activation 
of the various network components is likely to vary. The 
algorithms rely on the introduction of randomized Gauss- 
Seidel iterations of a Douglas-Rachford monotone operator. 
We prove that the latter iterations provide a powerful new 
method for finding the zeros of a sum of two monotone 
operators. Application of our method to problem (1) yields 
a randomized ADMM-like algorithm, which is proved to 
converge to the sought minimizers. 

The paper is organized as follows. The distributed optimization 
problem is rigorously stated in Section II. The 
synchronous ADMM algorithm that solves this problem is 
then described in Section III. Section IV forms the core 
of the paper. After quickly recalling the monotone operator 
formalism, the random Gauss-Seidel form of the proximal 
algorithm is described and its convergence is shown there. 
These results will eventually lead to an asynchronous version 
of the well-known Douglas-Rachford splitting algorithm. In 
Section V, the results of Section IV are applied towards developing 
an asynchronous version of the ADMM algorithm. 
An implementation example is finally provided in Section VI 
along with some simulations in Section VII. 

Notations 

Consider a non-directed graph $G = (V, E)$ where $V$ is 
a set of vertices and $E$ a set of edges. We sometimes note 
$v \sim w$ for $\{v, w\} \in E$. For any $A \subset V$, we denote by $G(A)$ 
the subgraph of $G$ induced by $A$ (i.e., $G(A)$ has vertices $A$ 
and for any $(v, w) \in A^2$, $\{v, w\}$ is an edge of $G(A)$ if and 
only if it is an edge of $G$). Let $X$ be a Euclidean space. We 
denote by $X^A$ the set of functions $A \to X$. It is endowed 
with the inner product $\langle x, y \rangle_A = \sum_{v \in A} \langle x(v), y(v) \rangle_X$ where 
$\langle \cdot, \cdot \rangle_X$ is the inner product on $X$. We will omit subscripts 
$X$ and $A$ when no confusion occurs. For any finite collection 
$A_1, \dots, A_L \subset V$, we endow the space $X^{A_1} \times \cdots \times X^{A_L}$ 
with the scalar product $\langle x, y \rangle = \sum_{\ell=1}^{L} \langle x_\ell, y_\ell \rangle_{A_\ell}$ for any 
$x = (x_1, \dots, x_L)$ and $y = (y_1, \dots, y_L)$. 

We denote by $\Pi_A x$ the restriction of $x$ to $A$, i.e., $\Pi_A : 
X^V \to X^A$ is the linear operator defined for any $x \in X^V$ 
as $\Pi_A x : (v \in A) \mapsto x(v)$. We denote by $1_A \in X^A$ the 
constant function equal to one and by $\mathrm{sp}(1_A)$ the linear span 
of $1_A$, i.e., the set of constant functions on $A$. Notation $|A|$ 
represents the cardinal of a set $A$. 

For a closed proper convex function $h : X \to (-\infty, +\infty]$ 
we define $\mathrm{prox}_{h,\rho}(x) = \arg\min_y\, h(y) + \frac{\rho}{2} \|y - x\|^2$. 
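As a small illustration of the operator $\mathrm{prox}_{h,\rho}$ just defined, the sketch below evaluates it in closed form for two scalar functions on $X = \mathbb{R}$. The function names and the choice of test functions (a quadratic and the absolute value) are assumptions made for the example; they do not come from the paper. Note that in this convention $\rho$ multiplies the quadratic penalty, so the soft-threshold level is $1/\rho$.

```python
import math


def prox_quadratic(a, b, rho, x):
    """prox_{h,rho}(x) for h(y) = (a/2) * (y - b)**2 with a >= 0 and rho > 0.

    The minimizer of (a/2)(y-b)^2 + (rho/2)(y-x)^2 solves
    a*(y-b) + rho*(y-x) = 0.
    """
    return (a * b + rho * x) / (a + rho)


def prox_abs(rho, x):
    """prox_{h,rho}(x) for h(y) = |y|: soft-thresholding at level 1/rho."""
    threshold = 1.0 / rho
    return math.copysign(max(abs(x) - threshold, 0.0), x)


if __name__ == "__main__":
    print(prox_quadratic(2.0, 1.0, 2.0, 3.0))
    print(prox_abs(2.0, 3.0), prox_abs(2.0, 0.3))
```

The quadratic case is the one reused in the later sketches, since it makes every local update available in closed form.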

II. Distributed Optimization on a Graph 

Consider a network of agents represented by a non-oriented 
graph $G = (V, E)$ where $V$ is a finite set of vertices 
(i.e., the agents) and $E$ is a set of edges. Each agent $v \in V$ 
has a private cost function $f_v : X \to (-\infty, +\infty]$ where $X$ 
is a Euclidean space. We make the following assumption on 
functions $f_v$. 

Assumption 1: 

i) For all $v \in V$, $f_v$ is a proper closed convex function. 

ii) The infimum in (1) is finite and is attained at some 
point $x^\star \in X$. 

In order to solve the optimization problem (1) on the graph 
$G$, we first provide an equivalent formulation of (1) that will 
prove useful. For some integer $L \geq 1$, consider a finite 
collection $A_1, A_2, \dots, A_L$ of subsets of $V$ which we shall 
refer to as components. We assume the following condition. 

Assumption 2: i) $\bigcup_{\ell=1}^{L} A_\ell = V$. 
ii) $\bigcup_{\ell=1}^{L} G(A_\ell)$ is connected. 

Assumption 2 i) implies that any vertex appears in at least one of 
the components $A_1, \dots, A_L$. We stress the fact that 
two distinct components $A_\ell$ and $A_{\ell'}$ are not necessarily 
disjoint, though. Assumption 2 ii) means that the union of all 
subgraphs is connected. As the latter union is also a subgraph 
of $G$, this implies that $G$ is connected. As will be made 
clear below, our algorithms shall assume that all agents in 
the same component are able to perform simple operations 
in a coordinated fashion (i.e., compute a local average over 
a component). Thus, in practice, it is reasonable to require 
that each subgraph $G(A_\ell)$ is itself connected. 
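Assumption 2 can be checked mechanically for a given graph and a candidate collection of components. The following is a minimal sketch under the assumption that vertices are hashable labels and that edges and components are given as plain Python collections; the function name is hypothetical.

```python
def satisfies_assumption_2(vertices, edges, components):
    """Return True if the components cover V and the union of the
    induced subgraphs G(A_l) is connected (Assumption 2 i) and ii))."""
    vertices = set(vertices)
    # i) every vertex belongs to at least one component
    if set().union(*components) != vertices:
        return False
    # Edges of the union of induced subgraphs: edges of G whose two
    # endpoints lie in a common component.
    kept = [(v, w) for (v, w) in edges
            if any(v in A and w in A for A in components)]
    adjacency = {v: set() for v in vertices}
    for v, w in kept:
        adjacency[v].add(w)
        adjacency[w].add(v)
    # ii) depth-first search from an arbitrary vertex
    start = next(iter(vertices))
    seen, stack = {start}, [start]
    while stack:
        for w in adjacency[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == vertices


if __name__ == "__main__":
    V = {1, 2, 3, 4, 5}
    E = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 3)]
    print(satisfies_assumption_2(V, E, [V]))
    print(satisfies_assumption_2(V, E, [{1, 2}, {3, 4, 5}]))
```

In the second call the components cover $V$, but the edge $\{2, 3\}$ lies in no common component, so the union of induced subgraphs is disconnected and the check fails.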

We introduce some notations. We set for any $x \in X^V$, 

$$f(x) = \sum_{v \in V} f_v(x(v)) .$$

For any $z = (z_1, \dots, z_L) \in Z = X^{A_1} \times \cdots \times X^{A_L}$, we define 
the closed proper convex function 

$$g(z) = \sum_{\ell=1}^{L} \iota_{\mathrm{sp}(1_{A_\ell})}(z_\ell)$$

where $\iota_H$ is the indicator function of a set $H$ (equal to zero 
on $H$ and to $+\infty$ outside). Here $g(z)$ is equal to zero if 
for any $\ell$, $z_\ell$ is constant. Otherwise, $g(z)$ is infinite. For any 
$x \in X^V$, we define $Mx = (\Pi_{A_1} x, \dots, \Pi_{A_L} x)$. We consider 
the following optimization problem: 

$$\inf_{x \in X^V} f(x) + g(Mx) \tag{2}$$

Lemma 1: Under Assumption 2, $x$ is a minimizer of (2) 
if and only if $x = \bar{x} 1_V$ where $\bar{x} \in X$ is a minimizer of (1). 
Proof: Let $x \in X^V$ be such that $g(Mx)$ is finite. Then 
$x$ is constant on each component. Let $v, w$ be two arbitrary 
vertices in $V$. There exists a path in $\bigcup_{\ell=1}^{L} G(A_\ell)$ connecting 
$v$ and $w$. Each edge of this path connects two vertices which 
belong to a common component. Thus, $x$ is constant on two 
consecutive vertices of the path. This proves that $x(v) = 
x(w)$. Thus, $x$ is constant and the result follows. ■ 
As noted in [9], solving Problem (2) is equivalent to the 
search of the zeros of a sum of two monotone operators. One 
possible approach to that end is to use ADMM. Although 
the choice of the sets $A_1, \dots, A_L$ does not change the 
minimizers of the initial problem, it has an impact on the 
particular form of ADMM used to find these minimizers, as 
we shall see below. 

In order to be more explicit, we provide in this section two 
important examples of possible choices for the components 
$A_1, \dots, A_L$. 

Example 1: Let $L = 1$ and $A_1 = V$. Problem (2) writes 

$$\inf_{x \in X^V} f(x) + \iota_{\mathrm{sp}(1_V)}(x) .$$

In this case, the formulation is identical to [11, Chapter 7]. 

Example 2: Let $L = |E|$ and $\{A_1, \dots, A_L\} = E$. That 
is, each set $A_\ell$ is a pair of vertices $\{v, w\}$ such that $\{v, w\}$ 
is an edge. Problem (2) writes 

$$\inf_{x \in X^V} f(x) + \sum_{\{v,w\} \in E} \iota_{\mathrm{sp}(1_2)}\big( (x(v), x(w))^T \big)$$

where $1_2$ stands for the vector $(1, 1)^T$. 

III. Synchronous ADMM 

A. General facts 

We now apply the standard ADMM to Problem (2). 
Perhaps the most direct way to describe ADMM is to 
reformulate the unconstrained problem (2) into the following 
constrained problem: Minimize $f(x) + g(z)$ subject to $z = 
Mx$. For any $x \in X^V$, $\lambda, z \in Z$, the augmented Lagrangian 
is given by 

$$\mathcal{L}_\rho(x, z; \lambda) = f(x) + g(z) + \langle \lambda, Mx - z \rangle + \frac{\rho}{2} \|Mx - z\|^2 \tag{3}$$

where $\rho > 0$ is a constant. ADMM consists of the iterations 

$$x^{k+1} = \arg\min_{x} \mathcal{L}_\rho(x, z^k; \lambda^k) \tag{4a}$$
$$z^{k+1} = \arg\min_{z} \mathcal{L}_\rho(x^{k+1}, z; \lambda^k) \tag{4b}$$
$$\lambda^{k+1} = \lambda^k + \rho\,(Mx^{k+1} - z^{k+1}) \tag{4c}$$

From [11, Chap. 3.2], the following result is immediate. 

Theorem 1: Under Assumption 1, the sequence $(x^k)$ defined 
in (4a) converges to a minimizer of (2). 

B. Decentralized Implementation 

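One should now make (4) more explicit and convince the 
reader that the iterations are indeed amenable to distributed 
implementation. Due to the specific form of function $g$, it 
is clear from (4b) that all components $z_1^k, \dots, z_L^k$ of $z^k$ are 
constant. Otherwise stated, $z^k = (\bar{z}_1^k 1_{A_1}, \dots, \bar{z}_L^k 1_{A_L})$ for 
some constants $\bar{z}_\ell^k \in X$. For any $v \in V$, we set 

$$\sigma(v) = \{\ell : v \in A_\ell\} .$$

Now consider the first update equation (4a). Getting rid of all 
quantities in $\mathcal{L}_\rho$ which do not depend on the $v$th component 
of $x$, we obtain for any $v \in V$ 

$$x^{k+1}(v) = \arg\min_{y \in X}\; f_v(y) + \sum_{\ell \in \sigma(v)} \Big( \langle \lambda_\ell^k(v), y \rangle + \frac{\rho}{2} \|y - \bar{z}_\ell^k\|^2 \Big) .$$

After some algebra, the above equation further simplifies to 

$$x^{k+1}(v) = \mathrm{prox}_{f_v,\, \rho|\sigma(v)|}\big( Z^k(v) - B^k(v) \big) \tag{5}$$

where we introduced the following constants: 

$$Z^k(v) = \frac{1}{|\sigma(v)|} \sum_{\ell \in \sigma(v)} \bar{z}_\ell^k , \qquad B^k(v) = \frac{1}{\rho|\sigma(v)|} \sum_{\ell \in \sigma(v)} \lambda_\ell^k(v) . \tag{6}$$

It is straightforward to show that the second update equation 
(4b) admits as well a simple decomposable form. After some 
algebra, we obtain that for any $\ell = 1, \dots, L$, 

$$\bar{z}_\ell^{k+1} = \frac{1}{|A_\ell|} \sum_{v \in A_\ell} x^{k+1}(v) + \frac{1}{\rho|A_\ell|} \sum_{v \in A_\ell} \lambda_\ell^k(v) . \tag{7}$$

Finally, for all $\ell = 1, \dots, L$ and $v \in A_\ell$, equation (4c) reads 

$$\lambda_\ell^{k+1}(v) = \lambda_\ell^k(v) + \rho\,\big(x^{k+1}(v) - \bar{z}_\ell^{k+1}\big) . \tag{8}$$

Averaging (8) w.r.t. $v$ and using (7) yields $\sum_{v \in A_\ell} \lambda_\ell^{k+1}(v) = 0$. 
Thus, the second term in the RHS of (7) can be deleted. 
Finally, averaging (8) w.r.t. $\ell \in \sigma(v)$ leads to 

$$B^{k+1}(v) = B^k(v) + x^{k+1}(v) - Z^{k+1}(v) . \tag{9}$$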

Synchronous ADMM: 

At each iteration $k$, 

For each agent $v$, compute $x^{k+1}(v)$ using (5). 
In each component $\ell = 1, \dots, L$, compute 

$$\bar{z}_\ell^{k+1} = \frac{1}{|A_\ell|} \sum_{v \in A_\ell} x^{k+1}(v) .$$

For each agent $v$, compute $Z^{k+1}(v)$ and $B^{k+1}(v)$ using (6) 
and (9) respectively. 

The above algorithm implicitly requires the existence of 
a routine for computing an average in each component $A_\ell$. 
This requirement is mild when the components coincide with 
edges of the graph as in Example 2. In this case, one only 
needs that the two vertices of an edge share their current 
estimate and find an agreement on the average. In the general 
case, the objective can be achieved by selecting a leader 
in each component whose role is to gather the estimates, 
compute the average and send the result to all agents in this 
component. 

It is worth noting that in the case of Example 1, the 
synchronous ADMM described above coincides with the 
algorithm of [11]. 
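The synchronous iterations (5), (6) and (9) can be sketched in a few lines for scalar quadratic costs. This is an illustrative sketch rather than the authors' implementation: it assumes $X = \mathbb{R}$, $f_v(y) = \frac{a_v}{2}(y - b_v)^2$ with $a_v > 0$ (so that the prox in (5) has a closed form), vertices indexed from 0, and multipliers initialized at zero so that the simplified average in the component step applies from the first iteration. The function name and parameters are hypothetical.

```python
def synchronous_admm(a, b, components, rho=1.0, n_iter=2000):
    """Synchronous ADMM for f_v(y) = (a[v]/2) * (y - b[v])**2 on X = R.

    components: list of lists of vertex indices (the sets A_l).
    Returns the list of local estimates x(v).
    """
    n = len(a)
    # sigma[v] = indices of the components containing v
    sigma = [[l for l, A in enumerate(components) if v in A] for v in range(n)]
    x = [0.0] * n
    zbar = [0.0] * len(components)
    B = [0.0] * n  # B^0(v) = 0, i.e. all multipliers start at zero
    for _ in range(n_iter):
        # (5): x(v) = prox_{f_v, rho*|sigma(v)|}(Z(v) - B(v)), closed form
        for v in range(n):
            Z_v = sum(zbar[l] for l in sigma[v]) / len(sigma[v])
            c = rho * len(sigma[v])
            x[v] = (a[v] * b[v] + c * (Z_v - B[v])) / (a[v] + c)
        # component averages
        for l, A in enumerate(components):
            zbar[l] = sum(x[v] for v in A) / len(A)
        # (6) and (9): refresh Z(v), then update B(v)
        for v in range(n):
            Z_v = sum(zbar[l] for l in sigma[v]) / len(sigma[v])
            B[v] += x[v] - Z_v
    return x


if __name__ == "__main__":
    a = [1.0, 2.0, 3.0, 4.0, 5.0]
    b = [1.0, -1.0, 2.0, 0.0, 3.0]
    edges = [[0, 1], [1, 2], [2, 3], [3, 4], [4, 2]]  # components as in Example 2
    print(synchronous_admm(a, b, edges))
```

For these costs the minimizer of (1) is $\sum_v a_v b_v / \sum_v a_v$, which gives a simple reference value against which the returned estimates can be compared.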

IV. A Randomized Proximal Algorithm 
A. Monotone operators 

An operator $T$ on a Euclidean space $Y$ is a set valued 
mapping $T : Y \to 2^Y$. An operator can be equivalently 
identified with a subset of $Y \times Y$, and we write $(x, y) \in T$ 
when $y \in T(x)$. Given two operators $T_1$ and $T_2$ on $Y$ and 
two real numbers $\alpha_1$ and $\alpha_2$, the operator $\alpha_1 T_1 + \alpha_2 T_2$ is 
defined as $\alpha_1 T_1 + \alpha_2 T_2 = \{(x, \alpha_1 y_1 + \alpha_2 y_2) : (x, y_1) \in 
T_1, (x, y_2) \in T_2\}$. The identity operator is $I = \{(x, x) : x \in 
Y\}$ and the inverse of the operator $T$ is $T^{-1} = \{(x, y) : 
(y, x) \in T\}$. The operator $T$ is said to be monotone if 

$$\forall (x, y), (x', y') \in T, \quad \langle x - x', y - y' \rangle \geq 0 .$$

A monotone operator is said to be maximal if it is not strictly 
contained in any monotone operator (as a subset of $Y \times Y$). 
Finally, $T$ is said to be firmly non-expansive if 

$$\forall (x, y), (x', y') \in T, \quad \langle x - x', y - y' \rangle \geq \|y - y'\|^2 .$$

The typical example of a monotone operator is the subdifferential 
$\partial f$ of a convex function $f : Y \to \mathbb{R}$. Finding 
a minimum of $f$ amounts to finding a point in $\mathrm{zer}(\partial f)$, 
where $\mathrm{zer}(T) = \{x : 0 \in T(x)\}$ is the set of zeroes of 
an operator $T$. A common technique for finding a zero of 
a maximal monotone operator $T$ is the so-called proximal 
point algorithm [15] that we now describe. The resolvent of 
$T$ is the operator $J_{\rho T} = (I + \rho T)^{-1}$ for $\rho > 0$. One key result 
(see e.g. [9]) says that $T$ is maximal monotone if and only 
if $J_{\rho T}$ is firmly non-expansive and its domain is $Y$. Observe 
that a firmly non-expansive operator is single valued and 
denote by $\mathrm{fix}(J_{\rho T})$ the set of fixed points of $J_{\rho T}$. It is clear 
that $\mathrm{fix}(J_{\rho T}) = \mathrm{zer}(T)$. The firm non-expansiveness of $J_{\rho T}$ 
plays a central role in the proof of the following result: 

Lemma 2 (Proximal point algorithm [15]): If $T$ is a 
maximal monotone operator and $\rho > 0$, then the iterates 
$\zeta^{k+1} = J_{\rho T}(\zeta^k)$ starting at any point of $Y$ converge to a 
point of $\mathrm{fix}(J_{\rho T})$ whenever this set is non-empty. 
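To make the proximal point iteration of Lemma 2 concrete, the sketch below runs it on $Y = \mathbb{R}$ for $T = \partial f$ with $f(y) = |y - c|$, whose only zero is $c$. The choice of $f$, the default $c = 3$ and the function names are assumptions made for the example. For this $T$ the resolvent $J_{\rho T}$ is a soft-thresholding of $x - c$ at level $\rho$.

```python
def resolvent_abs(rho, x, center=3.0):
    """J_{rho T}(x) = (I + rho*T)^{-1}(x) for T the subdifferential
    of f(y) = |y - center|: shrink x - center towards 0 by rho."""
    d = x - center
    shrunk = max(abs(d) - rho, 0.0)
    return center + (shrunk if d >= 0 else -shrunk)


def proximal_point(resolvent, x0, n_iter):
    """Iterate x <- resolvent(x), i.e. the proximal point algorithm."""
    x = x0
    for _ in range(n_iter):
        x = resolvent(x)
    return x


if __name__ == "__main__":
    print(proximal_point(lambda x: resolvent_abs(0.5, x), 10.0, 20))
```

Each step moves the iterate by at most $\rho$ towards $c$, and once $c$ is reached the iterate stays there, which is the fixed-point property $\mathrm{fix}(J_{\rho T}) = \mathrm{zer}(T)$.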



B. Random Gauss-Seidel iterations 

Assume now that the Euclidean space $Y$ is a Cartesian 
product of Euclidean spaces of the form $Y = Y_1 \times \cdots \times Y_L$ 
where $L$ is a given integer, and write any $\zeta \in Y$ as $\zeta = 
(\zeta_1, \dots, \zeta_L)$ where $\zeta_\ell \in Y_\ell$ for $\ell = 1, \dots, L$. Let $S$ be a 
firmly non-expansive operator on $Y$ and write 

$$S(\zeta) = (S_1(\zeta), \dots, S_L(\zeta))$$

where $S_\ell(\zeta) \in Y_\ell$. For $\ell = 1, \dots, L$, define the single valued 
operator $\hat{S}_\ell : Y \to Y$ as 

$$\hat{S}_\ell(\zeta) = (\zeta_1, \dots, \zeta_{\ell-1}, S_\ell(\zeta), \zeta_{\ell+1}, \dots, \zeta_L) . \tag{10}$$

Considering an iterative algorithm of the form $\zeta^{k+1} = S(\zeta^k)$, 
its Gauss-Seidel version would be an algorithm of the form 
$\zeta^{k+1} = \hat{S}_L \circ \cdots \circ \hat{S}_1(\zeta^k)$. We are interested here in a 
randomized version of these iterates. On a probability space 
$(\Omega, \mathcal{F}, \mathbb{P})$, let $(\xi^k)_{k \in \mathbb{N}}$ be a random process satisfying the 
following assumption: 

Assumption 3: The random variables $\xi^k$ are independent 
and identically distributed. They are valued in the set 
$\{1, \dots, L\}$ with $\mathbb{P}[\xi^1 = \ell] = p_\ell > 0$ for all $\ell = 1, \dots, L$. 
We are interested here in the convergence of the random 
iterates $\zeta^{k+1} = \hat{S}_{\xi^{k+1}}(\zeta^k)$ towards a (generally random) 
point of $\mathrm{fix}(S)$, provided this set is non-empty: 

Theorem 2 (Main result): Let $S$ be a firmly non-expansive 
operator on $Y$ with domain $Y$. Let $(\xi^k)_{k \in \mathbb{N}}$ be a sequence 
of random variables satisfying Assumption 3. Assume that 
$\mathrm{fix}(S) \neq \emptyset$. Then for any initial value $\zeta^0$, the sequence 
of iterates $\zeta^{k+1} = \hat{S}_{\xi^{k+1}}(\zeta^k)$ converges almost surely to a 
random variable supported by $\mathrm{fix}(S)$. 

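Proof: Denote by $\langle \zeta, \eta \rangle = \sum_{\ell=1}^{L} \langle \zeta_\ell, \eta_\ell \rangle_{Y_\ell}$ the inner 
product of $Y$, and by $\|\zeta\|^2 = \langle \zeta, \zeta \rangle$ its associated squared 
norm. Define a new inner product $\zeta \bullet \eta = \sum_{\ell=1}^{L} p_\ell^{-1} \langle \zeta_\ell, \eta_\ell \rangle_{Y_\ell}$ 
on $Y$, and let $|||\zeta|||^2 = \zeta \bullet \zeta$ be its associated squared 
norm. Fix $\zeta^\star$ in $\mathrm{fix}(S)$. Conditionally to the sigma-field 
$\mathcal{F}_k = \sigma(\xi^1, \dots, \xi^k)$ we have 

$$\begin{aligned}
\mathbb{E}\big[\, |||\zeta^{k+1} - \zeta^\star|||^2 \,\big|\, \mathcal{F}_k \big]
&= \sum_{\ell=1}^{L} p_\ell\, |||\hat{S}_\ell(\zeta^k) - \zeta^\star|||^2 \\
&= \sum_{\ell=1}^{L} p_\ell \Big( \frac{1}{p_\ell} \|S_\ell(\zeta^k) - \zeta_\ell^\star\|_{Y_\ell}^2 + \sum_{i \neq \ell} \frac{1}{p_i} \|\zeta_i^k - \zeta_i^\star\|_{Y_i}^2 \Big) \\
&= \|S(\zeta^k) - \zeta^\star\|^2 + \sum_{\ell=1}^{L} \frac{1 - p_\ell}{p_\ell} \|\zeta_\ell^k - \zeta_\ell^\star\|_{Y_\ell}^2 \\
&= |||\zeta^k - \zeta^\star|||^2 + \|S(\zeta^k) - \zeta^\star\|^2 - \|\zeta^k - \zeta^\star\|^2 .
\end{aligned}$$

Since $(I - S)(\zeta^\star) = 0$, we have 

$$\begin{aligned}
\|S(\zeta^k) - \zeta^\star\|^2 - \|\zeta^k - \zeta^\star\|^2
&= \|S(\zeta^k) - \zeta^k + \zeta^k - \zeta^\star\|^2 - \|\zeta^k - \zeta^\star\|^2 \\
&= \|S(\zeta^k) - \zeta^k\|^2 + 2 \langle S(\zeta^k) - \zeta^k, \zeta^k - \zeta^\star \rangle \\
&= \|S(\zeta^k) - \zeta^k\|^2 - 2 \langle (I - S)(\zeta^k) - (I - S)(\zeta^\star), \zeta^k - \zeta^\star \rangle \\
&\leq -\|S(\zeta^k) - \zeta^k\|^2
\end{aligned}$$

where the inequality comes from the easily verifiable fact 
that $(I - S)$ is firmly non-expansive when $S$ is. This leads to 
the inequality 

$$\mathbb{E}\big[\, |||\zeta^{k+1} - \zeta^\star|||^2 \,\big|\, \mathcal{F}_k \big] \leq |||\zeta^k - \zeta^\star|||^2 - \|S(\zeta^k) - \zeta^k\|^2 \tag{11}$$

which shows that $|||\zeta^k - \zeta^\star|||^2$ is a nonnegative supermartingale 
with respect to the filtration $(\mathcal{F}_k)$. As such, it converges 
with probability one towards a random variable $X_{\zeta^\star}$ satisfying 
$0 \leq X_{\zeta^\star} < \infty$ almost everywhere. Given a countable 
dense subset $H$ of $\mathrm{fix}(S)$, there is a probability one set 
on which $|||\zeta^k - \zeta||| \to \sqrt{X_\zeta} \in [0, \infty)$ for all $\zeta \in H$. Let 
$\zeta^\star \in \mathrm{fix}(S)$, let $\varepsilon > 0$, and choose $\zeta \in H$ such that 
$|||\zeta^\star - \zeta||| \leq \varepsilon$. With probability one, we have 

$$|||\zeta^k - \zeta^\star||| \leq |||\zeta^k - \zeta||| + |||\zeta - \zeta^\star||| \leq \sqrt{X_\zeta} + 2\varepsilon$$

for $k$ large enough. Similarly, $|||\zeta^k - \zeta^\star||| \geq \sqrt{X_\zeta} - 2\varepsilon$ for $k$ 
large enough. We therefore obtain: 

C1 : There is a probability one set on which $|||\zeta^k - \zeta^\star|||$ 
converges for every $\zeta^\star \in \mathrm{fix}(S)$. 
Getting back to Inequality (11), taking the expectations on 
both sides of this inequality and iterating over $k$, we obtain 

$$\sum_{k=0}^{\infty} \mathbb{E}\big[ \|S(\zeta^k) - \zeta^k\|^2 \big] \leq |||\zeta^0 - \zeta^\star|||^2 .$$

By Markov's inequality and the Borel-Cantelli lemma, we 
therefore obtain: 

C2 : $S(\zeta^k) - \zeta^k \to 0$ almost surely. 

We now consider an elementary event in the probability one 
set where C1 and C2 hold. On this event, since $|||\zeta^k - \zeta^\star|||$ 
converges for $\zeta^\star \in \mathrm{fix}(S)$, the sequence $\zeta^k$ is bounded. 
Since $S$ is firmly non-expansive, it is continuous, and C2 
shows that all the accumulation points of $\zeta^k$ are in $\mathrm{fix}(S)$. 
It remains to show that these accumulation points reduce 
to one point. Assume that $\zeta_1^\star$ is an accumulation point. 
By C1, $|||\zeta^k - \zeta_1^\star|||$ converges. Therefore, $\lim |||\zeta^k - \zeta_1^\star||| = 
\liminf |||\zeta^k - \zeta_1^\star||| = 0$, which shows that $\zeta_1^\star$ is unique. ■ 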
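The random Gauss-Seidel iterates $\zeta^{k+1} = \hat{S}_{\xi^{k+1}}(\zeta^k)$ can be sketched on a toy operator. The example below takes $Y = \mathbb{R}^L$ with scalar blocks and $S$ equal to the orthogonal projection onto the constant vectors (which is firmly non-expansive, with $\mathrm{fix}(S)$ the set of constant vectors). This choice of $S$, the function names and the seeding scheme are assumptions made for illustration only.

```python
import random


def averaging_operator(zeta):
    """Orthogonal projection onto constant vectors: a firmly
    non-expansive S whose fixed points are the constant vectors."""
    m = sum(zeta) / len(zeta)
    return [m] * len(zeta)


def random_gauss_seidel(S, zeta0, probs, n_iter, seed=0):
    """Iterate zeta <- S_hat_l(zeta), with l drawn i.i.d. with law probs."""
    rng = random.Random(seed)
    zeta = list(zeta0)
    blocks = range(len(zeta))
    for _ in range(n_iter):
        l = rng.choices(blocks, weights=probs)[0]  # draw xi^{k+1}
        zeta[l] = S(zeta)[l]                       # only block l is updated
    return zeta


if __name__ == "__main__":
    zeta0 = [0.0, 1.0, 4.0, -2.0, 7.0]
    probs = [0.1, 0.2, 0.3, 0.2, 0.2]
    for seed in (0, 1):
        print(random_gauss_seidel(averaging_operator, zeta0, probs, 2000, seed))
```

Running the loop with different seeds typically yields different constant limits, in line with the theorem's statement that the limit is a random variable supported by $\mathrm{fix}(S)$.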

V. Random ADMM 

We now return to the optimization problem (2). It is a 
well known fact that the standard ADMM can be seen as 
a special case of the so-called Douglas-Rachford algorithm [9]. 
The Douglas-Rachford algorithm can itself be seen as a 
special case of a proximal point algorithm. By the results 
of the previous section, this suggests that random Gauss-Seidel 
iterations applied to the Douglas-Rachford operator 
produce a sequence which eventually converges to the sought 
solutions. It turns out that the latter random iterations can be 
written in the form of a practical asynchronous ADMM-like 
algorithm. 

A. Douglas-Rachford operator 

Consider the following dual problem associated with (2) 

$$\min_{\lambda \in Z} f^*(-M^* \lambda) + g^*(\lambda) , \tag{12}$$

where $f^*, g^*$ are the Fenchel conjugates of $f$ and $g$ and 
$M^*$ is the adjoint of $M$. By Assumption 1 along with [16, 
Th.3.3.5], the minimum in (12) is attained and its opposite 
coincides with the minimum of (2). Note that $\lambda$ is a minimizer 
of (12) iff zero belongs to the subdifferential of the 
objective function in (12). By [16, Th.3.3.5] again, this reads 
$0 \in -M\, \partial f^*(-M^* \lambda) + \partial g^*(\lambda)$. Otherwise stated, finding 
minimizers of the dual problem (12) boils down to searching 
zeros of the sum of two maximal monotone operators $T + U$ 
defined by $T = -M\, \partial f^* \circ (-M^*)$ and $U = \partial g^*$. For a 
fixed $\rho > 0$, the Douglas-Rachford / Lions-Mercier operator 
$R$ is defined as 

$$\{(\nu + \rho b, \mu - \nu) : (\mu, b) \in U, (\nu, a) \in T, \nu + \rho a = \mu - \rho b\} .$$

The following Lemma is an immediate consequence of [9]. 

Lemma 3: Under Assumption 1, $R$ is maximal monotone, 
and $\mathrm{zer}(R) \neq \emptyset$. Moreover, $J_{\rho U}(\zeta) \in \mathrm{zer}(T + U)$ for any 
$\zeta \in \mathrm{zer}(R)$. 

Lemma 3 implies that the search for a zero of $T + U$ boils 
down to the search of a zero of $R$ up to a resolvent step 
$J_{\rho U}$. To that end, a standard approach is to use a proximal 
point algorithm of the form $\zeta^{k+1} = J_R(\zeta^k)$. By [9], it can 
be shown that this approach is equivalent to the ADMM 
derived in Section III. Here, our aim is different. We shall 
consider random Gauss-Seidel iterations in order to derive 
an asynchronous version of the ADMM. 

B. Random Gauss-Seidel Iterations 

Define $S = J_R$ as the resolvent associated with the 
Douglas-Rachford operator $R$. On the space $Z = X^{A_1} \times \cdots \times 
X^{A_L}$, define the operator $\hat{S}_\ell$ as in (10) for any $\ell = 1, \dots, L$. 
Let $(\xi^k)_{k \in \mathbb{N}}$ be a random process satisfying Assumption 3. 
The following result is a consequence of Theorem 2 combined 
with Lemma 3. 

Theorem 3: Let Assumptions 1, 2 and 3 hold true. Consider 
the sequence $(\zeta^k)_k$ defined by $\zeta^{k+1} = \hat{S}_{\xi^{k+1}}(\zeta^k)$. 
Then for any initial value $\zeta^0$, the sequence $\lambda^k = J_{\rho U}(\zeta^k)$ 
converges almost surely to a minimizer of (12). 
In order to complete the above result, we still must justify 
the fact that, as claimed, the above iterations can be seen as 
an asynchronous distributed algorithm. 

C. Distributed Algorithm 

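We make the above random Gauss-Seidel iterations more 
explicit. In the sequel we shall always denote by $\zeta_\ell$ the $\ell$th 
component of a function $\zeta \in Z$, i.e., $\zeta = (\zeta_1, \dots, \zeta_L)$. For 
any $\ell$, we introduce the average $\bar{\zeta}_\ell = \sum_{v \in A_\ell} \zeta_\ell(v) / |A_\ell|$. 
Lemma 4 below states that any $\zeta \in Z$ is uniquely represented 
by a couple $(\lambda, z) \in U$ whose expression is provided. 
Moreover, it provides the explicit form of the $\ell$th block of 
the resolvent $S$. This shall be the basis of our asynchronous 
distributed algorithm. 

Lemma 4: For any $\zeta \in Z$, the following holds true. 

i) There exists a unique $(\lambda, z) \in U$ such that $\lambda + \rho z = \zeta$. 

ii) $J_{\rho U}(\zeta) = \lambda$. 

iii) For any $\ell = 1, \dots, L$, $\lambda_\ell = \zeta_\ell - \bar{\zeta}_\ell 1_{A_\ell}$ and $z_\ell = \frac{\bar{\zeta}_\ell}{\rho} 1_{A_\ell}$. 

iv) For any $\ell = 1, \dots, L$, and any $v \in A_\ell$, 

$$S_\ell(\zeta) : v \mapsto \lambda_\ell(v) + \rho\, x(v) \tag{13}$$

where $x(v)$ is defined by 

$$x(v) = \mathrm{prox}_{f_v,\, \rho|\sigma(v)|}\Big( \frac{1}{|\sigma(v)|} \sum_{\ell \in \sigma(v)} \Big( \bar{z}_\ell - \frac{\lambda_\ell(v)}{\rho} \Big) \Big) \tag{14}$$

and $\bar{z}_\ell = \bar{\zeta}_\ell / \rho$ denotes the constant value of $z_\ell$. 

Proof: i)-ii) Existence: Let us define $\lambda = J_{\rho U}(\zeta)$ and 
$z = (\zeta - \lambda)/\rho$. Trivially, $\lambda + \rho z = \zeta$. As $\zeta \in \lambda + \rho U(\lambda)$, we 
deduce that $(\lambda, z) \in U$. Uniqueness: For a fixed $(\lambda, z) \in U$ 
satisfying $\lambda + \rho z = \zeta$, one has $\zeta \in (I + \rho U)(\lambda)$ and thus 
$\lambda = J_{\rho U}(\zeta)$. As a consequence, $z = (\zeta - \lambda)/\rho$. 

iii) We use $\lambda = J_{\rho U}(\zeta) = \mathrm{prox}_{g^*, \rho^{-1}}(\zeta) = \zeta - \mathrm{prox}_{g, \rho}(\zeta)$ 
(see [17, Th. 14.3]). As $g$ is the indicator function of the set 
$\mathrm{sp}(1_{A_1}) \times \cdots \times \mathrm{sp}(1_{A_L})$, $\mathrm{prox}_{g, \rho}$ coincides with the projection 
operator onto that set. Thus, for any $\ell$, $\lambda_\ell = \zeta_\ell - \bar{\zeta}_\ell 1_{A_\ell}$. The 
expression of $z$ follows from $z = (\zeta - \lambda)/\rho$. 

iv) Operator $S = J_R$ can be written as 

$$\{(\mu + \rho b, \nu + \rho b) : (\mu, b) \in U, (\nu, a) \in T, \nu + \rho a = \mu - \rho b\} .$$

Moreover, as $R$ is monotone, $S(\zeta)$ is a singleton. Representing 
$\zeta = \lambda + \rho z$ with $(\lambda, z) \in U$, it follows from the 
above expression of $S$ that $S(\zeta) = \nu + \rho z$ where $\nu$ is 
such that $\nu + \rho a = \lambda - \rho z$ for some $a \in T(\nu)$. Using 
$T = -M\, \partial f^* \circ (-M^*)$, condition $a \in T(\nu)$ translates 
to: there exists $x \in \partial f^*(-M^* \nu)$ s.t. $a = -Mx$. The output 
of the resolvent is obtained by $\nu + \rho z = \lambda + \rho M x$. For a given 
component $\ell$, this boils down to equation (13). The remaining 
task is to provide the expression of $x$. By the Fenchel-Young 
equality $\partial f^* = \partial f^{-1}$ [16, Prop. 3.3.4], condition $x \in 
\partial f^*(-M^* \nu)$ is equivalent to $-M^* \nu \in \partial f(x)$. Using that $\nu = 
\lambda - \rho (z - Mx)$, we obtain $0 \in \partial f(x) + M^* \lambda + \rho M^* (Mx - z)$. 
Otherwise stated, $x = \arg\min_{y \in X^V} \mathcal{L}_\rho(y, z; \lambda)$ where $\mathcal{L}_\rho$ is 
the augmented Lagrangian defined in (3). Using the results 
of Section III, $x(v)$ is given by (14) for any $v$. ■ 

We are now in position to state the main algorithm. It simply 
consists in an explicit writing of the random Gauss-Seidel 
iterations $\zeta^{k+1} = \hat{S}_{\xi^{k+1}}(\zeta^k)$ using Lemma 4 iv). Note that, 
by Lemma 4 i), the definition of a sequence $(\zeta^k)_k$ on $Z$ is 
equivalent to the definition of two sequences $(\lambda^k, z^k) \in U$ 
such that $\zeta^k = \lambda^k + \rho z^k$. Moreover, by Lemma 4 iii), each 
component $z_\ell^k$ of $z^k$ is a constant. The definition of $z^k$ thus 
reduces to the definition of $L$ constants $\bar{z}_1^k, \dots, \bar{z}_L^k$ in $X$. 

Asynchronous ADMM: 

At each iteration $k$, draw r.v. $\xi^{k+1}$. 

For $\ell = \xi^{k+1}$, set for any $v \in A_\ell$: 

$$x^{k+1}(v) = \mathrm{prox}_{f_v,\, \rho|\sigma(v)|}\Big( \frac{1}{|\sigma(v)|} \sum_{\ell' \in \sigma(v)} \Big( \bar{z}_{\ell'}^k - \frac{\lambda_{\ell'}^k(v)}{\rho} \Big) \Big)$$

$$\bar{z}_\ell^{k+1} = \frac{1}{|A_\ell|} \sum_{w \in A_\ell} x^{k+1}(w)$$

$$\lambda_\ell^{k+1}(v) = \lambda_\ell^k(v) + \rho\, \big( x^{k+1}(v) - \bar{z}_\ell^{k+1} \big) .$$

For any $\ell \neq \xi^{k+1}$, set $\lambda_\ell^{k+1} = \lambda_\ell^k$ and $\bar{z}_\ell^{k+1} = \bar{z}_\ell^k$. 

For any $w \notin A_{\xi^{k+1}}$, set $x^{k+1}(w) = x^k(w)$. 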
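The asynchronous iterations can be sketched as follows for scalar quadratic costs. As before, this is an illustration under stated assumptions and not the authors' code: $X = \mathbb{R}$, $f_v(y) = \frac{a_v}{2}(y - b_v)^2$ with $a_v > 0$, vertices indexed from 0, and all multipliers and component constants initialized at zero (so that each $\lambda_\ell$ sums to zero over its component). The function name, the parameter names and the seeded random generator are hypothetical.

```python
import random


def asynchronous_admm(a, b, components, probs, rho=1.0, n_iter=20000, seed=0):
    """Randomized ADMM for f_v(y) = (a[v]/2) * (y - b[v])**2 on X = R.

    components: list of lists of vertex indices (the sets A_l).
    probs: activation probabilities p_l of the components.
    """
    rng = random.Random(seed)
    n, L = len(a), len(components)
    sigma = [[l for l in range(L) if v in components[l]] for v in range(n)]
    x = [0.0] * n
    zbar = [0.0] * L
    lam = [{v: 0.0 for v in A} for A in components]  # lam[l][v] = lambda_l(v)
    for _ in range(n_iter):
        l = rng.choices(range(L), weights=probs)[0]  # draw xi^{k+1}
        A = components[l]
        # prox step for the agents of the active component only
        for v in A:
            c = rho * len(sigma[v])
            target = sum(zbar[m] - lam[m][v] / rho for m in sigma[v]) / len(sigma[v])
            x[v] = (a[v] * b[v] + c * target) / (a[v] + c)
        # local average, then multiplier update, inside the active component
        zbar[l] = sum(x[v] for v in A) / len(A)
        for v in A:
            lam[l][v] += rho * (x[v] - zbar[l])
        # all other components and agents keep their previous values
    return x


if __name__ == "__main__":
    a = [1.0, 2.0, 3.0, 4.0, 5.0]
    b = [1.0, -1.0, 2.0, 0.0, 3.0]
    edges = [[0, 1], [1, 2], [2, 3], [3, 4], [4, 2]]
    print(asynchronous_admm(a, b, edges, [0.2] * 5))
```

Only the agents of the drawn component compute and communicate at a given iteration, which is the point of the method: no global clock or coordinator appears anywhere in the loop.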



VI. Implementation Example 

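In order to illustrate our results, we consider herein an 
asynchronous version of the ADMM algorithm in the context 
of Section II, Example 2. The scenario is the following: first, 
Agent $v \in \{1, \dots, |V|\}$ wakes up at time $k + 1$ with the 
probability $q_v$. Denoting by $\mathcal{N}_v$ the neighborhood of Agent $v$ 
in the graph $G$, this agent then chooses one of its neighbors, 
say $w$, with the probability $1/|\mathcal{N}_v|$ and sends an activation 
message to $w$. In this setting, the edge $\{v, w\}$ coincides with 
one of the $A_\ell$ of Example 2 in Section II. It is easy to see 
that the samples of the activation process $\xi^k$, which is of course 
valued in $E$, are governed by the probability law 

$$\mathbb{P}[\xi^1 = \{v, w\}] = \frac{q_v}{|\mathcal{N}_v|} + \frac{q_w}{|\mathcal{N}_w|} > 0 .$$

When the edge $\{v, w\}$ is activated, the following two prox(·) 
operations are performed by the agents: 

$$x^{k+1}(v) = \mathrm{prox}_{f_v,\, \rho|\mathcal{N}_v|}\Big( \frac{1}{|\mathcal{N}_v|} \sum_{u \in \mathcal{N}_v} \Big( \bar{z}_{\{u,v\}}^k - \frac{\lambda_{\{u,v\}}^k(v)}{\rho} \Big) \Big)$$

$$x^{k+1}(w) = \mathrm{prox}_{f_w,\, \rho|\mathcal{N}_w|}\Big( \frac{1}{|\mathcal{N}_w|} \sum_{u \in \mathcal{N}_w} \Big( \bar{z}_{\{u,w\}}^k - \frac{\lambda_{\{u,w\}}^k(w)}{\rho} \Big) \Big)$$

The two agents then exchange the values $x^{k+1}(v)$ and 
$x^{k+1}(w)$ and perform the following operations: 

$$\bar{z}_{\{v,w\}}^{k+1} = \frac{x^{k+1}(v) + x^{k+1}(w)}{2}$$

$$\lambda_{\{v,w\}}^{k+1}(v) = \lambda_{\{v,w\}}^k(v) + \rho\, \frac{x^{k+1}(v) - x^{k+1}(w)}{2}$$

$$\lambda_{\{v,w\}}^{k+1}(w) = \lambda_{\{v,w\}}^k(w) + \rho\, \frac{x^{k+1}(w) - x^{k+1}(v)}{2}$$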

We remark that this communication scheme is reminiscent 
of the so-called Random Gossip algorithm introduced in [18] 
in the context of distributed averaging. 
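The edge activation law of this scheme follows from two events: either $v$ wakes up and picks $w$, or $w$ wakes up and picks $v$. The sketch below computes the resulting probability of each edge for a given graph, under the assumption that the wake-up probabilities of the agents sum to one; the function name and the use of exact fractions are choices made for the example.

```python
from fractions import Fraction


def edge_activation_probabilities(edges, wake_probs):
    """Probability that each edge {v, w} is activated when agent v wakes
    up with probability wake_probs[v] and picks a neighbor uniformly."""
    neighbors = {}
    for v, w in edges:
        neighbors.setdefault(v, set()).add(w)
        neighbors.setdefault(w, set()).add(v)
    return {
        frozenset((v, w)): wake_probs[v] / len(neighbors[v])
        + wake_probs[w] / len(neighbors[w])
        for v, w in edges
    }


if __name__ == "__main__":
    E = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 3)]
    q = {v: Fraction(1, 5) for v in range(1, 6)}  # uniform wake-up law
    for edge, p in edge_activation_probabilities(E, q).items():
        print(sorted(edge), p)
```

With a uniform wake-up law on the five-agent graph used in the simulations, the edge probabilities are not uniform: edges attached to low-degree agents are activated more often.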

VII. Numerical Results 

We consider a network with $V = \{1, \dots, 5\}$ and with 
$E = \{\{1, 2\}, \{2, 3\}, \{3, 4\}, \{4, 5\}, \{5, 3\}\}$. We evaluate the 
behavior of: i) the Synchronous ADMM, ii) the Asynchronous 
ADMM and iii) the Distributed Gradient Descent with $1/\sqrt{k}$ 
stepsize [19] using Random Gossip as a communication 
algorithm [18]. Each agent maintains a different quadratic 
convex function and their goal is to reach consensus over 
the minimizer of problem (1). 

In Figure 1, we plot the squared error versus the number of 
primal updates for the three considered algorithms. We observe 
that our algorithm clearly outperforms the Distributed 
Gradient Descent. 